HIGH DEGREES IN RANDOM RECURSIVE TREES


LOUIGI ADDARIO-BERRY AND LAURA ESLAVA 


Abstract. For $n \ge 1$, let $T_n$ be a random recursive tree (RRT) on the vertex set
$[n] = \{1, \ldots, n\}$. Let $\deg_{T_n}(v)$ be the degree of vertex $v$ in $T_n$, that is, the number
of children of $v$ in $T_n$. Devroye and Lu [6] showed that the maximum degree $\Delta_n$ of
$T_n$ satisfies $\Delta_n / \lfloor \log_2 n \rfloor \to 1$ almost surely; Goh and Schmutz [7] showed distributional
convergence of $\Delta_n - \lfloor \log_2 n \rfloor$ along suitable subsequences. In this work we show how a
version of Kingman’s coalescent can be used to access much finer properties of the degree
distribution in $T_n$.

For any $i \in \mathbb{Z}$, let $X_i^{(n)} = |\{v \in [n] : \deg_{T_n}(v) = \lfloor \log_2 n \rfloor + i\}|$. Also, let $\mathcal{P}$ be a Poisson
point process on $\mathbb{R}$ with rate function $\lambda(x) = 2^{-x} \cdot \ln 2$. We show that, up to lattice
effects, the vectors $(X_i^{(n)},\, i \in \mathbb{Z})$ converge weakly in distribution to $(\mathcal{P}[i, i+1),\, i \in \mathbb{Z})$.
We also prove asymptotic normality of $X_i^{(n)}$ when $i = i(n) \to -\infty$ slowly, and obtain
precise asymptotics for $\mathbf{P}(\Delta_n - \log_2 n > i)$ when $i(n) \to \infty$ and $i(n)/\log n$ is not too
large. Our results recover and extend the previous distributional convergence results on
maximal and near-maximal degrees in random recursive trees.


1. Statement of results 

The process of random recursive trees $(T_n,\, n \ge 1)$ is defined as follows. $T_1$ has a single
node with label 1, which is its root. The tree $T_{n+1}$ is obtained from $T_n$ by directing an edge
from a new vertex $n+1$ to $v \in [n]$; the choice of $v$ is uniformly random and independent
for each $n \in \mathbb{N}$. We call $T_n$ a random recursive tree (RRT) of size $n$.
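
The construction above can be simulated directly. The following is a minimal sketch, not part of the original paper; the function names `random_recursive_tree` and `degrees` are hypothetical and chosen only for illustration.

```python
import random


def random_recursive_tree(n, rng=None):
    """Return a dict parent[v] for v in 1..n describing an RRT of size n.

    parent[1] is None (vertex 1 is the root); every later vertex attaches
    to a uniformly random earlier vertex, as in the definition above.
    """
    rng = rng or random.Random()
    parent = {1: None}
    for new in range(2, n + 1):
        parent[new] = rng.randint(1, new - 1)  # uniform on [new - 1]
    return parent


def degrees(parent):
    """Number of children of each vertex (the degree used in the paper)."""
    deg = {v: 0 for v in parent}
    for p in parent.values():
        if p is not None:
            deg[p] += 1
    return deg
```

Since every non-root vertex is the child of exactly one vertex, the degrees of a tree of size $n$ always sum to $n - 1$.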

As a consequence of the construction, vertex labels in $T_n$ increase along root-to-leaf paths.
Rooted labelled trees with this property are called increasing trees. It is not difficult to see
that, in fact, $T_n$ is uniformly chosen among the set $\mathcal{I}_n$ of increasing trees with vertex set $[n]$.
We write $\deg_{T_n}(v)$ to denote the number of children of $v$ in $T_n$. The degree distribution
of $T_n$ is encoded by the variables $Z_i^{(n)} = |\{v \in [n] : \deg_{T_n}(v) = i\}|$, for $i \ge 0$. In fact, the
study of RRT’s started with a paper by Na and Rapoport [13] in which they obtained, for
any fixed $i \ge 0$, the convergence $\mathbf{E}[Z_i^{(n)}]/n \to 2^{-i-1}$ as $n \to \infty$; this result was extended to
convergence in probability by Meir and Moon in [12]. Mahmoud and Smythe [11] derived
the asymptotic joint normality of $Z_i^{(n)}$ for $i \in \{0, 1, 2\}$; and finally, Janson [8] extended the
joint normality to $Z_i^{(n)}$ for $i \ge 0$ and gave explicit formulae for the covariance matrix.

The above results concern typical degrees; the focus in this work is large degree vertices,
and in particular the maximum degree in $T_n$, which we denote $\Delta_n = \max_{v \in [n]} \deg_{T_n}(v)$.
For the rest of the paper we write $\log$ to denote logarithms with base 2, and $\ln$ to denote
natural logarithms. For $n \in \mathbb{N}$ let $\varepsilon_n = \log n - \lfloor \log n \rfloor$.

A heuristic to find the order of $\Delta_n$ is that, if $\mathbf{E}[Z_i^{(n)}] \approx n 2^{-i-1}$ were to hold for all
$i$, as it does when $i$ is fixed, then we would have $\mathbf{E}[Z_{\lfloor \log n \rfloor}^{(n)}] \approx n 2^{-\lfloor \log n \rfloor - 1} = 2^{-1+\varepsilon_n}$.
This heuristic suggests that $\Delta_n$ is of order $\log n$. This is indeed the case: Szymanski [15]
proved that $\mathbf{E}[\Delta_n]/\log n \to 1$ as $n \to \infty$, and Devroye and Lu [6] later established that
$\Delta_n/\log n \to 1$ a.s. Finally, Goh and Schmutz [7] showed that $\Delta_n - \lfloor \log n \rfloor$ converges in
distribution along suitable subsequences, and identified the possible limiting laws.
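
To make the heuristic concrete, the short sketch below evaluates $\varepsilon_n$ and the heuristic count $n 2^{-(\lfloor \log n \rfloor + i) - 1}$ numerically. It is an illustration only; the helper names `eps` and `heuristic_count` are assumptions introduced here.

```python
import math


def eps(n):
    """Fractional part of log2(n), written epsilon_n in the text."""
    return math.log2(n) - math.floor(math.log2(n))


def heuristic_count(n, i):
    """Heuristic value n * 2^{-(floor(log2 n) + i) - 1} for the expected
    number of vertices of degree floor(log2 n) + i."""
    return n * 2.0 ** (-(math.floor(math.log2(n)) + i) - 1)
```

For $i = 0$ this agrees with $2^{-1+\varepsilon_n}$, a quantity that stays between $1/2$ and $1$ and oscillates with $n$; this oscillation is the source of the lattice effects discussed in the paper.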

Since we focus on maximal degrees, it is useful to let

\[ X_i^{(n)} = Z_{i + \lfloor \log n \rfloor}^{(n)} = |\{v \in [n] : \deg_{T_n}(v) = \lfloor \log n \rfloor + i\}|, \]

for $n \in \mathbb{N}$ and $i \ge -\lfloor \log n \rfloor$. The following is a simplified version of one of our main results.

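Theorem 1.1. Fix $\varepsilon \in [0,1]$. Let $(n_l)_{l \ge 1}$ be an increasing sequence of integers satisfying
$\varepsilon_{n_l} \to \varepsilon$ as $l \to \infty$. Then, as $l \to \infty$,

\[ (X_i^{(n_l)},\, i \in \mathbb{Z}) \xrightarrow{d} (P_i^{\varepsilon},\, i \in \mathbb{Z}) \]

jointly for all $i \in \mathbb{Z}$, where the $P_i^{\varepsilon}$ are independent Poisson r.v.’s with mean $2^{-i-1+\varepsilon}$.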

The random variables $X_i^{(n)}$ do not converge in distribution as $n \to \infty$ without taking
subsequences; this is essentially a lattice effect caused by the floor $\lfloor \log n \rfloor$ in the definition
of $X_i^{(n)}$.

Theorem 1.1 can be stated in terms of weak convergence of point processes (which is
equivalent to convergence of finite dimensional distributions (FDD’s); see Theorem 11.1.VII
in [4]). In fact, we will also prove convergence (along subsequences) of

\[ X_{\ge i}^{(n)} = \sum_{k \ge i} X_k^{(n)} = |\{v \in [n] : \deg_{T_n}(v) \ge \lfloor \log n \rfloor + i\}|. \]

This is useful as it yields information about $\Delta_n$ which cannot be derived from Theorem 1.1.
We formulate this result as a statement about convergence of point processes, and
now provide the relevant definitions. Let $\mathbb{Z}^* = \mathbb{Z} \cup \{\infty\}$. Endow $\mathbb{Z}^*$ with the metric defined
by $d(i,j) = |2^{-j} - 2^{-i}|$ and $d(i,\infty) = 2^{-i}$ for $i, j \in \mathbb{Z}$. Let $\mathcal{M}^{\#}_{\mathbb{Z}^*}$ be the space of boundedly
finite measures of $\mathbb{Z}^*$.

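Let $\mathcal{P}$ be a Poisson point process on $\mathbb{R}$ with rate function $\lambda(x) = 2^{-x} \cdot \ln 2$. For each
$\varepsilon \in [0,1]$ let $\mathcal{P}^{\varepsilon}$ be the point process on $\mathbb{Z}^*$ given by

\[ \mathcal{P}^{\varepsilon} = \sum_{x \in \mathcal{P}} \delta_{\lfloor x + \varepsilon \rfloor}. \]

Similarly, for all $n \in \mathbb{N}$ let

\[ \mathcal{P}^{(n)} = \sum_{v \in [n]} \delta_{\deg_{T_n}(v) - \lfloor \log n \rfloor}. \]

Then, for each $i \in \mathbb{Z}$ we have that

\[ \mathcal{P}^{\varepsilon}(\{i\}) := |\{x \in \mathcal{P} : \lfloor x + \varepsilon \rfloor = i\}| = |\{x \in \mathcal{P} : x \in [i - \varepsilon, i + 1 - \varepsilon)\}| \]

has distribution $\mathrm{Poi}(2^{-i-1+\varepsilon})$; also $\mathcal{P}^{(n)}(\{i\}) = X_i^{(n)}$. We abuse notation by writing, e.g.,
$\mathcal{P}^{(n)}(i) = \mathcal{P}^{(n)}(\{i\})$.

It is clear that $\mathcal{P}^{(n)}$ and $\mathcal{P}^{\varepsilon}$ are elements of $\mathcal{M}^{\#}_{\mathbb{Z}^*}$. The advantage of working on the state
space $\mathbb{Z}^*$ is that intervals $[i, \infty]$ are compact. In particular, the convergence of FDD’s of
$\mathcal{P}^{(n_l)}$ implies the convergence in distribution of $X_{\ge i}^{(n_l)} = \mathcal{P}^{(n_l)}[i, \infty)$.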
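
The Poisson parameters quoted above come from integrating the rate function $2^{-x} \ln 2$, whose antiderivative is $-2^{-x}$. The sketch below checks the two resulting closed forms numerically; the function names are hypothetical and introduced only for this illustration.

```python
import math


def poisson_mass(a, b):
    """Integral of 2^{-x} ln 2 over [a, b), i.e. 2^{-a} - 2^{-b}."""
    return 2.0 ** (-a) - 2.0 ** (-b)


def cell_mean(i, eps):
    """Mean number of points of the process in [i - eps, i + 1 - eps)."""
    return poisson_mass(i - eps, i + 1 - eps)


def tail_mean(i, eps):
    """Mean number of points of the process in [i - eps, infinity)."""
    return poisson_mass(i - eps, math.inf)
```

The first quantity equals $2^{-i-1+\varepsilon}$ and the second equals $2^{-i+\varepsilon}$, which are the means appearing in the statements for single cells and for tails respectively.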
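Theorem 1.2. Fix $\varepsilon \in [0,1]$. Let $(n_l)_{l \ge 1}$ be an increasing sequence of integers satisfying
$\varepsilon_{n_l} \to \varepsilon$ as $l \to \infty$. Then in $\mathcal{M}^{\#}_{\mathbb{Z}^*}$, $\mathcal{P}^{(n_l)}$ converges weakly to $\mathcal{P}^{\varepsilon}$ as $l \to \infty$. Equivalently,
for any $i < i' \in \mathbb{Z}$, jointly as $l \to \infty$,

\[ (X_i^{(n_l)}, \ldots, X_{i'-1}^{(n_l)}, X_{\ge i'}^{(n_l)}) \xrightarrow{d} (\mathcal{P}^{\varepsilon}(i), \ldots, \mathcal{P}^{\varepsilon}(i'-1), \mathcal{P}^{\varepsilon}[i', \infty)). \]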

Note that Theorem 1.1 follows from Theorem 1.2. We finish this section by stating two
additional results. The first is an extension of the main theorem from [7], that result being
essentially the case $i = O(1)$.

Theorem 1.3. For any $i = i(n)$ with $i + \log n < 2 \ln n$ and $\liminf_{n \to \infty} i(n) > -\infty$,

\[ \mathbf{P}(\Delta_n \ge \lfloor \log n \rfloor + i) = (1 - \exp\{-2^{-i+\varepsilon_n}\})(1 + o(1)). \]
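
As an illustration of the main term in Theorem 1.3, the following sketch evaluates $1 - \exp\{-2^{-i+\varepsilon_n}\}$ for given $n$ and $i$. It computes only the leading-order approximation (the $1 + o(1)$ factor is ignored), and the function name is an assumption made here.

```python
import math


def max_degree_tail_approx(n, i):
    """Leading-order approximation of P(Delta_n >= floor(log2 n) + i)."""
    eps_n = math.log2(n) - math.floor(math.log2(n))
    return 1.0 - math.exp(-(2.0 ** (-i + eps_n)))
```

For large $i$ the value is close to $2^{-i+\varepsilon_n}$, since $1 - e^{-x} \approx x$ for small $x$; this is the regime handled with first and second moments in the proof.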

When $i = O(1)$, the assertion of Theorem 1.3 is a straightforward consequence of Theorem 1.2.
For the case that $i(n) \to \infty$ we use estimates for the first and second moments of
$X_{\ge i}^{(n)}$; note that $\{\Delta_n < \lfloor \log n \rfloor + i\} = \{X_{\ge i}^{(n)} = 0\}$.

Finally, we also obtain the asymptotic normality of $X_i^{(n)}$ when $i$ tends to $-\infty$ slowly
enough.


Theorem 1.4. If $i = i(n) \to -\infty$ and $i = o(\ln n)$, then as $n \to \infty$,

\[ \frac{X_i^{(n)} - 2^{-i-1+\varepsilon_n}}{\sqrt{2^{-i-1+\varepsilon_n}}} \xrightarrow{d} N(0,1). \]


Remark 1.5. Up to lattice effects, Theorems 1.2 and 1.4 extend the range of $i = i(n)$ for
which the heuristic that $\mathbf{E}[Z_i^{(n)}] \approx n 2^{-i-1}$ holds.


A key novelty of our approach is that for each $n$ we use Kingman’s coalescent to generate
a tree $T^{(n)}$ whose vertex degrees $\{\deg_{T^{(n)}}(v)\}_{v \in [n]}$ are exchangeable but otherwise have the
same law as degrees in $T_n$. (See [2], Chapter 2 for a description of Kingman’s coalescent,
and [1], Section 2.2 for a description of the connection with random recursive trees which
we exploit in this paper.) By this we mean that if $\sigma : [n] \to [n]$ is a uniformly random
permutation then the following distributional identity holds:

\[ (\deg_{T^{(n)}}(v),\, v \in [n]) \stackrel{d}{=} (\deg_{T_n}(\sigma(v)),\, v \in [n]). \tag{1} \]

We describe the trees $T^{(n)}$, $n \in \mathbb{N}$, in Section 3.

An essentially equivalent construction was used by Devroye [ ] to study union-find trees. 
In [14], Pittel related the results of [5] on union-find trees to the height of RRT’s. It is worth 
mentioning that both Kingman’s coalescent and the union-find trees can be equivalently 
represented as binary trees or, as we will see in Section 3, as RRT’s. Aside from the works 
[5] and [14], it seems that the use of Kingman’s coalescent or of union-find trees to study 
RRT’s is rare. However, it turns out to provide just the right perspective for studying high 
degree vertices. 


2. Outline 

In this section we sketch the approach used in the paper. The proofs of the theorems
rely on the computation of the moments of the FDD’s of $\mathcal{P}^{(n)}$; these estimates are given
in Proposition 2.1. In particular, the proofs of Theorems 1.2 and 1.4 use the method of
moments (e.g., see [9] Section 6.1, and [3] Section 1.5).

Any FDD of $\mathcal{P}^{(n)}$ can be recovered from suitable marginals of the joint distribution of
$(X_i^{(n)}, \ldots, X_{i'-1}^{(n)}, X_{\ge i'}^{(n)})$ for some $i < i' \in \mathbb{Z}$. For simplicity, we focus for the moment
on collections of variables $X_i^{(n)}, \ldots, X_{i'}^{(n)}$ for $i < i'$. For $r \in \mathbb{R}$ and $a \in \mathbb{N}$ write $(r)_a =
r(r-1) \cdots (r-a+1)$, also let $(r)_0 = 1$. We will prove that for any non-negative integers
$a_i, \ldots, a_{i'}$, as $n \to \infty$, we have


\[ \mathbf{E}\Big[ \prod_{i \le k \le i'} (X_k^{(n)})_{a_k} \Big] - \prod_{i \le k \le i'} \big(2^{-(k+1)+\varepsilon_n}\big)^{a_k} \to 0. \tag{2} \]

This immediately yields Theorem 1.1.
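
The reason (2) identifies a Poisson limit is that a Poisson variable with mean $\lambda$ has $a$-th factorial moment exactly $\lambda^a$. The sketch below checks this standard fact numerically by truncating the defining series; it is an illustration only, and the function names and the truncation level `terms` are assumptions made here.

```python
import math


def falling(r, a):
    """Falling factorial (r)_a = r (r - 1) ... (r - a + 1), with (r)_0 = 1."""
    out = 1
    for j in range(a):
        out *= r - j
    return out


def poisson_factorial_moment(lam, a, terms=200):
    """Truncated series for E[(P)_a] when P is Poisson with mean lam."""
    total = 0.0
    pmf = math.exp(-lam)  # P(P = 0)
    for k in range(terms):
        total += falling(k, a) * pmf
        pmf *= lam / (k + 1)  # P(P = k + 1) from P(P = k)
    return total
```

With independent coordinates, the joint factorial moments factor into products of such terms, which is the shape of the limit in (2).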

By the linearity of expectation, proving (2) reduces to understanding the probabilities

\[ \mathbf{P}(\deg_{T_n}(v_k) = \lfloor \log n \rfloor + i_k,\, k \in [K]) \tag{3} \]

for all $i_1, \ldots, i_K \in \mathbb{N}$ and $v_1, \ldots, v_K \in [n]$, $K \in \mathbb{N}$; see Section 5 for more details.

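In the standard model for RRT’s described at the beginning, $\deg_{T_n}(v)$ is a sum of
Bernoulli variables:

\[ \deg_{T_n}(v) = \sum_{v < u \le n} \mathbf{1}_{\{u \to v\}}, \]

where $\{u \to v\}$ denotes the event that vertex $u$ attaches to $v$.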

The lack of symmetry of the degrees $\{\deg_{T_n}(v)\}_{v \in [n]}$ complicates the analysis of (3). In
proving that $\Delta_n/\log n \to 1$ a.s., Devroye and Lu [6] used that $\{\deg_{T_n}(v)\}_{v \in [n]}$ are negatively
orthant dependent (see [10] for a definition), which in particular means that for all $S \subset [n]$
and $m_1, \ldots, m_n \in \mathbb{N}$

\[ \mathbf{P}(\deg_{T_n}(v) \ge m_v,\, v \in S) \le \prod_{v \in S} \mathbf{P}(\deg_{T_n}(v) \ge m_v), \tag{4} \]

and then obtained upper bounds for $\mathbf{P}(\deg_{T_n}(v) \ge c \ln n)$ for each $v \in [n]$.

One approach to studying high degrees in $T_n$ would be to obtain matching lower bounds
for $\mathbf{P}(\deg_{T_n}(v) \ge m_v,\, v \in S)$, with uniform error terms even when $m_v$ is large. Instead, we
study the trees $T^{(n)}$ mentioned in (1) above, for which we can obtain precise asymptotics for
the analogous probabilities

\[ \mathbf{P}(\deg_{T^{(n)}}(v) \ge m_v,\, v \in [K]). \tag{5} \]

The core of the paper lies in Proposition 4.2, which gives precise estimates of (5) for
$m_1, \ldots, m_K$ in a suitable range. Broadly speaking, $\deg_{T^{(n)}}(v)$ depends on a set of random
selection times $\mathcal{S}_v$ and the first streak of heads in a sequence of $|\mathcal{S}_v|$ fair coin flips. As
mentioned in the previous section, the degrees of $T^{(n)}$ have the same distribution as the
degrees in $T_n$. Consequently, our estimation of (5) allows us to obtain the following moment
estimate.


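Proposition 2.1. For all $c \in (0,2)$ and $K \in \mathbb{N}$ there is $\alpha = \alpha(c,K) > 0$ such that the
following holds. Fix any integers $i, i'$ with $0 < i + \log n < i' + \log n < c \ln n$. Then for any
non-negative integers $a_i, \ldots, a_{i'}$ with $a_i + \ldots + a_{i'} = K$, we have

\[ \mathbf{E}\Big[ (X_{\ge i'}^{(n)})_{a_{i'}} \prod_{i \le k < i'} (X_k^{(n)})_{a_k} \Big]
   = \big(2^{-i'+\varepsilon_n}\big)^{a_{i'}} \prod_{i \le k < i'} \big(2^{-(k+1)+\varepsilon_n}\big)^{a_k} (1 + o(n^{-\alpha})). \]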


Equipped with Proposition 2.1, the proofs of the theorems are straightforward. The rest
of the paper is organized as follows. In Section 3, we explain how to define the trees $T^{(n)}$
using Kingman’s coalescent and establish the distributional relation between $T^{(n)}$ and the
RRT; see Corollary 3.4. In Section 4, we define the random sets $(\mathcal{S}_v,\, v \in T^{(n)})$ and explain
their relation with degrees in $T^{(n)}$. The proof of Proposition 4.2, which is our estimate of
(5), is then presented using a decoupling of the events in (5) and the concentration of the
random variables $|\mathcal{S}_v|$. Finally, the proof of Proposition 2.1 is given in Section 5 and the
proofs of Theorems 1.2-1.4 are in Section 6.

3. Random Recursive Trees and Kingman’s coalescent 

In this section we give a representation of Kingman’s coalescent in terms of labelled
forests, and relate it to RRT’s. All trees in the remainder of the paper are rooted, and we
write $r(t)$ for the root of tree $t$. By convention, edges of a tree are directed towards the root
of the tree and we write $uv$ to denote an edge directed from $u$ to $v$. A forest $f$ is a set of
trees whose vertex sets are pairwise disjoint. The vertex set of a forest, denoted $V(f)$, is
the union of the vertex sets of its trees. Similarly, $E(f)$ denotes the set of edges in the trees
of $f$. For $n \ge 1$, let

\[ \mathcal{F}_n = \{f : V(f) = [n]\} \]

be the set of forests with vertex set $[n]$.

A sequence $C = (f_1, \ldots, f_n)$ of elements of $\mathcal{F}_n$ is an $n$-chain if $f_1$ is the forest in $\mathcal{F}_n$ with
$n$ one-vertex trees and, for $1 \le i < n$, $f_{i+1}$ is obtained from $f_i$ by adding a directed edge
between the roots of some pair of trees in $f_i$. If $(f_1, \ldots, f_n)$ is an $n$-chain then for $1 \le i \le n$,
the forest $f_i$ consists of $n + 1 - i$ trees, and in this case we list its elements in increasing
order of their smallest-labelled vertex as $t_1^{(i)}, \ldots, t_{n+1-i}^{(i)}$.

Definition 3.1. Kingman’s $n$-coalescent is the random $n$-chain $C = (F_1, \ldots, F_n)$ built as
follows. Independently for each $1 \le i \le n-1$ let $\{a_i, b_i\}$ be a random pair uniformly
chosen from $\{\{a,b\} : 1 \le a < b \le n+1-i\}$ and let $\xi_i$ be independent with Bernoulli(1/2)
distribution.

For $1 \le i < n$, construct $F_{i+1}$ from $F_i$ as follows. If $\xi_i = 1$ then add an edge from $r(T_{b_i}^{(i)})$
to $r(T_{a_i}^{(i)})$ and if $\xi_i = 0$ then add an edge from $r(T_{a_i}^{(i)})$ to $r(T_{b_i}^{(i)})$. The forest $F_{i+1}$ consists
of the new tree and the remaining $n - 1 - i$ unaltered trees from $F_i$.
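
Definition 3.1 translates almost line by line into a simulation. The sketch below is one possible implementation under the conventions stated in the definition (trees listed by smallest label; $\xi_i = 1$ keeps the root of the tree with index $a_i$); the function name `kingman_coalescent` and the parent-map output format are assumptions made for illustration.

```python
import random


def kingman_coalescent(n, rng=None):
    """Run Kingman's n-coalescent and return the parent map of the final tree.

    Each tree is stored as (smallest label, root); the list `trees` is kept
    sorted by smallest label, matching the indexing T_1, T_2, ... of the text.
    """
    rng = rng or random.Random()
    trees = [(v, v) for v in range(1, n + 1)]
    parent = {v: None for v in range(1, n + 1)}
    for _step in range(1, n):
        # uniform pair of tree indices a < b (0-based here)
        a, b = sorted(rng.sample(range(len(trees)), 2))
        xi = rng.randint(0, 1)  # fair coin
        (min_a, root_a), (_min_b, root_b) = trees[a], trees[b]
        if xi == 1:
            parent[root_b] = root_a  # edge from r(T_b) to r(T_a)
            new_root = root_a
        else:
            parent[root_a] = root_b  # edge from r(T_a) to r(T_b)
            new_root = root_b
        # a < b implies min_a is the smallest label of the merged tree,
        # so the merged tree keeps position a and the list stays sorted
        trees[a] = (min_a, new_root)
        del trees[b]
    return parent
```

After $n - 1$ merges exactly one vertex has no parent; it is the root of the single tree of $F_n$.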

For an example of the process see Figure 1. 

Lemma 3.2. Let $\mathcal{CF}_n$ be the set of $n$-chains of elements in $\mathcal{F}_n$. Then $|\mathcal{CF}_n| = n!(n-1)!$
and Kingman’s $n$-coalescent is a uniformly random element of $\mathcal{CF}_n$.

Proof. Fix an $n$-chain $(f_1, \ldots, f_n) \in \mathcal{CF}_n$. Then

\[ \mathbf{P}((F_1, \ldots, F_n) = (f_1, \ldots, f_n)) = \prod_{k=1}^{n-1} \mathbf{P}(F_{k+1} = f_{k+1} \mid F_j = f_j,\, 1 \le j \le k). \]

Among the $(n+1-k)(n-k)$ possible oriented edges between roots of $f_k$, there is exactly
one whose addition yields $f_{k+1}$. It follows that the $k$-th term in the above product
is $((n+1-k)(n-k))^{-1}$, so $\mathbf{P}((F_1, \ldots, F_n) = (f_1, \ldots, f_n)) = [n!(n-1)!]^{-1}$. The result
follows since this expression does not depend on $(f_1, \ldots, f_n) \in \mathcal{CF}_n$. □
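
For small $n$ the count $|\mathcal{CF}_n| = n!(n-1)!$ can be confirmed by brute force. The sketch below enumerates every sequence of oriented edge additions between roots; the function name `count_chains` is hypothetical, and the enumeration is only feasible for very small $n$.

```python
from itertools import permutations


def count_chains(n):
    """Count n-chains by enumerating all sequences of edge additions.

    A forest is represented by its parent map; at each step every ordered
    pair (u, v) of distinct roots gives one admissible oriented edge u -> v,
    and different edges lead to different forests.
    """
    def extend(parent):
        roots = [v for v in range(1, n + 1) if parent[v] is None]
        if len(roots) == 1:
            return 1  # the chain is complete
        total = 0
        for u, v in permutations(roots, 2):
            child = dict(parent)
            child[u] = v
            total += extend(child)
        return total

    return extend({v: None for v in range(1, n + 1)})
```

For example, `count_chains(4)` explores $4 \cdot 3$ choices, then $3 \cdot 2$, then $2 \cdot 1$, for a total of $144 = 4! \cdot 3!$.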

Recall that $\mathcal{I}_n$ is the set of increasing trees with vertex set $[n]$. It is not difficult to see
that $|\mathcal{I}_n| = (n-1)!$ and that a RRT is a uniformly random element of $\mathcal{I}_n$.

There is a natural mapping $\phi$ between $n$-chains and increasing trees. Given an $n$-chain
$C = (f_1, \ldots, f_n)$, write $t^{(n)} := t_1^{(n)}$ for the unique tree in $f_n$. Let $L_C^- : E(t^{(n)}) \to [n-1]$ be
defined as follows. For each $e \in E(t^{(n)})$, let

\[ L_C^-(e) = \max\{i \in [n-1] : e \notin E(f_i)\}. \]
Figure 1. An example of Kingman’s $n$-coalescent $C = (F_1, \ldots, F_n)$ for
$n = 6$. For $1 \le i < n$, $F_i$ has, in dotted line, the edge in $E(F_{i+1}) \setminus
E(F_i)$. Edges are marked with their time of addition; this is the function
$L_C^-$ defined after Lemma 3.2. In this instance, $\xi_1 = \xi_3 = \xi_4 = 1$, $\xi_2 = \xi_5 = 0$
and $\{a_1, b_1\} = \{2,5\}$, $\{a_2, b_2\} = \{1,5\}$, $\{a_3, b_3\} = \{1,4\}$, $\{a_4, b_4\} =
\{2,3\}$, $\{a_5, b_5\} = \{1,2\}$.
Figure 2. On the left, a tree $t^{(n)}$; edges are marked with $L_C^-$, from which
the $n$-chain $C = (f_1, \ldots, f_n)$ can be recovered. On the right, the increasing
tree $\phi(f_1, \ldots, f_n)$; it has the shape of $t^{(n)}$ and the vertex labels $\{L_C(v),\, v \in
V(t^{(n)})\}$.


We think of $L_C^-$ as a function that keeps track of the time of addition of the edges along
the $n$-chain $C$. Now, we define a vertex labelling $L_C : V(t^{(n)}) \to [n]$ as follows. Let
$L_C(r(t^{(n)})) = 1$ and for each $uv \in E(t^{(n)})$, let

\[ L_C(u) = n + 1 - L_C^-(uv); \]

then $L_C(u)$ is the number of trees in the forest just before $uv$ is added.

Note that for each $i \in [n-1]$, the new edge in $f_{i+1}$ joins the roots of two trees in
$f_i$ and is directed towards the root of the resulting tree. Thus, the labels $\{L_C^-(e),\, e \in
E(t^{(n)})\}$ increase along all paths in $t^{(n)}$ towards the root $r(t^{(n)})$ and consequently, the labels
$\{L_C(v),\, v \in V(t^{(n)})\}$ increase along root-to-leaf paths in $t^{(n)}$. This shows that relabelling
the vertices of $t^{(n)}$ with $L_C$ yields an increasing tree (specifically, an element of $\mathcal{I}_n$). See
Figure 2 for an example.
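
The relabelling described above is easy to carry out mechanically. The following sketch takes the edges of an $n$-chain in their order of addition and returns the parent map of the relabelled tree; the function name `relabel_chain` and the edge-list input format are assumptions, and the sample chain used in the usage note is chosen here purely for illustration.

```python
def relabel_chain(edges):
    """Relabel the final tree of an n-chain on vertex set {1, ..., n}.

    edges[i - 1] = (u, v) is the edge added at time i, directed from u to v.
    The tail u of that edge receives the label n + 1 - i; the root, which has
    no outgoing edge, receives the label 1.
    """
    n = len(edges) + 1
    label = {}
    for time, (u, _v) in enumerate(edges, start=1):
        label[u] = n + 1 - time
    root = (set(range(1, n + 1)) - set(label)).pop()
    label[root] = 1
    # parent map of the relabelled tree: new label of u -> new label of v
    return {label[u]: label[v] for u, v in edges}
```

For instance, with $n = 6$ and edges added in the order $(5,2), (1,6), (4,6), (3,2), (6,2)$, the result is the parent map `{6: 1, 5: 2, 4: 2, 3: 1, 2: 1}`, in which every parent label is smaller than the label of its child, as required of an increasing tree.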

Proposition 3.3. Let $\phi : \mathcal{CF}_n \to \mathcal{I}_n$ be defined as follows. For an $n$-chain $C = (f_1, \ldots, f_n)$
let $\phi(C)$ be the tree obtained from $t^{(n)}$ by relabelling its vertices with $L_C$. Then $\phi(C)$, the
push-forward of Kingman’s $n$-coalescent by $\phi$, has the law of a RRT of size $n$.

Proof. First, we prove that $\phi$ is onto. Fix an increasing tree $t \in \mathcal{I}_n$. For each $j \in V(t) \setminus \{1\}$,
let $v_j \in V(t)$ be such that $j v_j \in E(t)$; recall that edges are directed toward the root of $t$,
thus $v_j$ is uniquely defined. For each $1 < j \le n$, let $e_{n-j+1} = j v_j$.

Now construct an $n$-chain $C$ as follows. Let $f_1$ be the forest with $n$ one-vertex trees. For
each $1 < i \le n$ construct $f_i$ from $f_{i-1}$ by adding the edge $e_{i-1}$. In other words, for each
$1 \le i < n$, $L_C^-(e_i) = i$ and so $L_C(n+1-i) = n+1-L_C^-(e_i) = n+1-i$; also since $r(t) = 1$,
we have $L_C(1) = 1$. Consequently, $\phi(C) = t$.

We claim that $|\phi^{-1}(t)| \ge n!$ for any $t \in \mathcal{I}_n$. To see this, consider an $n$-chain $C$ and
a permutation $\sigma : [n] \to [n]$. Let $C^{\sigma}$ be the $n$-chain obtained from $C$ by permuting the
vertices in each forest of $C$ by $\sigma$. Since $L_C(v)$ depends only on the time of addition of its
outgoing edge (if any), it follows that $\phi(C) = \phi(C^{\sigma})$ for all permutations $\sigma$. Since
$|\mathcal{CF}_n| = n!(n-1)!$ by Lemma 3.2 and $|\mathcal{I}_n| = (n-1)!$, this shows that $\phi$ is $n!$-to-1 and,
again by Lemma 3.2, that $\phi(C)$ is a uniform element in $\mathcal{I}_n$. □

Since $\phi(C)$ preserves the shape of $T^{(n)}$ and only relabels its vertices, the degrees in $T^{(n)}$
and $\phi(C)$ are equal as multisets: $\{\deg_{T^{(n)}}(v)\}_{v \in [n]} = \{\deg_{\phi(C)}(v)\}_{v \in [n]}$. This immediately
gives the following key corollary of Proposition 3.3, on which the rest of the paper relies.
Figure 3. If $v$ is a root in $T_{a_i}^{(i)} \cup T_{b_i}^{(i)}$ and $\xi_i$ favours $v$, then $v$ increases its
degree and remains a root in $F_{i+1}$.

Corollary 3.4. For all $n \in \mathbb{N}$, the following equality in distribution holds jointly
for all $i \in \mathbb{Z}$:

\[ X_i^{(n)} \stackrel{d}{=} |\{v \in [n] : \deg_{T^{(n)}}(v) = \lfloor \log n \rfloor + i\}|. \]

We now proceed to the study of the joint distribution of the vertex degrees in $T^{(n)}$.

4. Degree distribution: Selection sets and coin flips 

By construction, the vertex degrees $\{\deg_{T^{(n)}}(v)\}_{v \in [n]}$ are exchangeable. Our next goal is
to explain how to approximate (5); that is, for any fixed $k \in \mathbb{N}$ and integers $m_1, \ldots, m_k <
2 \ln n$, to obtain estimates for $\mathbf{P}(\deg_{T^{(n)}}(v) \ge m_v,\, v \in [k])$.

The key to analysing the degrees in $T^{(n)}$ is to understand how the degree of a vertex $v \in [n]$
changes in Kingman’s coalescent $C = (F_1, \ldots, F_n)$. For any vertex $v$ and $1 \le i \le n$, denote by
$\deg_{F_i}(v)$ the number of children of $v$ in $F_i$. Also, we will simply write $\deg(v) = \deg_{F_n}(v) =
\deg_{T^{(n)}}(v)$. For each $1 \le i < n$, if $\xi_i = 1$ we say that $\xi_i$ favours the vertices of $T_{a_i}^{(i)}$, and
otherwise that it favours the vertices of $T_{b_i}^{(i)}$. For $v \in [n]$, let

\[ \mathcal{S}_v = \{i \in [n-1] : v \in T_{a_i}^{(i)} \cup T_{b_i}^{(i)}\}. \]

For any vertex $v$, and $1 \le i < n$, $\deg_{F_{i+1}}(v)$ increases by one only if $v$ is a root in $F_i$,
$i \in \mathcal{S}_v$ and $\xi_i$ favours $v$; see Figure 3. Conversely, let $\rho_v = \min\{i \in \mathcal{S}_v : \xi_i \text{ does not favour } v\}$;
then the first $F_{i+1}$ in which $v$ is not a root is exactly $i = \rho_v$. In this case, in $F_{\rho_v+1}$ there is
an outgoing edge from $v$, and $v$ is not a root of any subsequent forests. As a consequence,
$\deg_{F_j}(v) = \deg_{F_{\rho_v}}(v)$ for $\rho_v \le j \le n$.

Fact 4.1. For $v \in [n]$, $\deg(v) = \deg_{F_{\rho_v}}(v) = |\mathcal{S}_v \cap [\rho_v - 1]|$.

In other words, $\deg(v)$ depends only on its first streak of favourable random variables
$\xi_i$ with $i \in \mathcal{S}_v$. More precisely, given $|\mathcal{S}_v|$, the degree $\deg(v)$ is distributed as $\min\{|\mathcal{S}_v|, G\}$,
where $G$ is a Geometric(1/2) r.v. independent of $\mathcal{S}_v$.
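
The "first streak" description can be checked against a direct simulation of the coalescent. The sketch below records, for every vertex, the outcome of the coin at each of its selection times, and compares the length of the initial favourable streak with the actual number of children. All names are hypothetical, and the coalescent conventions are those of Definition 3.1 as read in this sketch.

```python
import random


def coalescent_with_selection(n, rng):
    """Simulate Kingman's n-coalescent, tracking degrees and coin outcomes.

    Returns (deg, favoured): deg[v] is the final number of children of v and
    favoured[v] lists, in time order, whether the coin favoured v's tree at
    each time the tree containing v was selected.
    """
    # each tree: (smallest label, root, vertex set), sorted by smallest label
    trees = [(v, v, {v}) for v in range(1, n + 1)]
    deg = {v: 0 for v in range(1, n + 1)}
    favoured = {v: [] for v in range(1, n + 1)}
    for _step in range(1, n):
        a, b = sorted(rng.sample(range(len(trees)), 2))
        xi = rng.randint(0, 1)
        (min_a, root_a, set_a), (_min_b, root_b, set_b) = trees[a], trees[b]
        for v in set_a:
            favoured[v].append(xi == 1)
        for v in set_b:
            favoured[v].append(xi == 0)
        winner = root_a if xi == 1 else root_b
        deg[winner] += 1  # the favoured root gains a child and stays a root
        trees[a] = (min_a, winner, set_a | set_b)
        del trees[b]
    return deg, favoured


def first_streak(flags):
    """Length of the initial run of True values."""
    streak = 0
    for flag in flags:
        if not flag:
            break
        streak += 1
    return streak
```

In every run, `deg[v] == first_streak(favoured[v])` holds for all vertices, which is the content of Fact 4.1.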

Thus, it is relevant to observe that $|\mathcal{S}_v|$ is distributed as a sum of independent (though
not identically distributed) Bernoulli random variables and so it is concentrated around its
mean $\mathbf{E}[|\mathcal{S}_v|] = 2 \ln n + O(1)$; a more precise statement can be found in Proposition 4.5 below.
Since $|\mathcal{S}_v| \to \infty$ in probability as $n \to \infty$, it follows easily that $\deg(v)$ is asymptotically
geometric for any fixed node $v$. More strongly, the following proposition shows that for
any fixed $k$, the random variables $\{\deg_{T^{(n)}}(v)\}_{v \in [k]}$ asymptotically behave like independent
Geometric random variables, even if they are conditioned to be quite large.
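
The mean of $|\mathcal{S}_v|$ can be computed exactly from the selection probabilities $2/(n-i+1)$ used later in the proof of Proposition 4.5. The sketch below sums them and compares with $2 \ln n$; the function name is an assumption made for this illustration.

```python
import math


def expected_selection_count(n):
    """Exact mean of |S_v|: sum over steps i = 1..n-1 of 2 / (n - i + 1)."""
    return sum(2.0 / (n - i + 1) for i in range(1, n))
```

The sum equals $2(H_n - 1)$, where $H_n$ is the $n$-th harmonic number, so it differs from $2 \ln n$ by a bounded amount.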
Proposition 4.2. Fix $c \in (0,2)$ and $k \in \mathbb{N}$. There exists $\alpha = \alpha(c,k) > 0$ such that
uniformly over positive integers $m_1, \ldots, m_k < c \ln n$,

\[ \mathbf{P}(\deg_{T^{(n)}}(v) \ge m_v,\, v \in [k]) = 2^{-\sum_v m_v} (1 + o(n^{-\alpha})). \]

We now explain how the events in the proposition above can be decoupled into a product
of two probabilities, one of them corresponding to tail bounds for the random variables $|\mathcal{S}_v|$.
We start with an upper bound for Proposition 4.2.


Lemma 4.3. For any $k \in \mathbb{N}$ and positive integers $m_1, \ldots, m_k < n$,

\[ \mathbf{P}(\deg(v) \ge m_v,\, v \in [k]) \le 2^{-\sum_v m_v}\, \mathbf{P}(|\mathcal{S}_v| \ge m_v,\, v \in [k]). \]

Equality holds for $k = 1$.


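Proof. For each $v \in [k]$ list $\mathcal{S}_v$ in increasing order as $(i_{v,j},\, 1 \le j \le |\mathcal{S}_v|)$. Let $\mathcal{A}$ be the set
of sequences $A = (A_1, \ldots, A_k)$ satisfying $A_v \subset [n-1]$ and $|A_v| = m_v$ for all $v \in [k]$. For
every $A \in \mathcal{A}$, let $D_A$ be the event that $|\mathcal{S}_v| \ge m_v$ and $\{i_{v,1}, \ldots, i_{v,m_v}\} = A_v$, for all $v \in [k]$.
By Fact 4.1, if $\deg(v) \ge m_v$ then necessarily $|\mathcal{S}_v| \ge m_v$ so

\[ \{\deg(v) \ge m_v,\, v \in [k]\} \cap D_A = \{\xi_{i_{v,j}} \text{ favours } v \text{ for all } j \in [m_v],\, v \in [k]\} \cap D_A. \]

Now, the $\xi_i$ are i.i.d. Bernoulli(1/2) r.v.’s. Thus, if $D_A$ has positive probability then

\[ \mathbf{P}(\xi_{i_{v,j}} \text{ favours } v \text{ for all } j \in [m_v],\, v \in [k] \mid D_A) =
   \begin{cases} 2^{-\sum_v m_v} & \text{if } |A_u \cap A_v| = 0,\ \forall u \ne v \in [k], \\ 0 & \text{otherwise.} \end{cases} \]

The second case follows from the fact that if $i \in \mathcal{S}_u \cap \mathcal{S}_v$ for some $u \ne v$, then $\xi_i$ cannot
favour both $u$ and $v$. The events $(D_A,\, A \in \mathcal{A})$ are pairwise disjoint, and if $\deg(v) \ge m_v$ for
all $v \in [k]$ then one of the events $D_A$ must occur. It follows that

\[ \mathbf{P}(\deg(v) \ge m_v,\, v \in [k]) = \sum_{A \in \mathcal{A}} \mathbf{P}(D_A,\, \deg(v) \ge m_v,\, v \in [k]) \]
\[ \le 2^{-\sum_v m_v} \sum_{A \in \mathcal{A}} \mathbf{P}(D_A) \]
\[ = 2^{-\sum_v m_v}\, \mathbf{P}(|\mathcal{S}_v| \ge m_v,\, v \in [k]). \]

Finally, the second line holds with equality when $k = 1$. □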


For the lower bound we restrict to events $D_A$ where the sets $A_v$ are already disjoint. To
do so, we consider instead the vertex degrees in $F_l$ for some $l < n$. For $k \ge 2$ let

\[ \tau_k = \min\{i \in [n-1] : \{a_i, b_i\} \subset [k]\}. \]

Since $F_i \subset F_j$ for all $i < j \in [n]$ we have that for any $l < n$

\[ \mathbf{P}(\deg(v) \ge m_v,\, v \in [k]) \ge \mathbf{P}(\deg_{F_{l+1}}(v) \ge m_v,\, v \in [k]) \]
\[ \ge \mathbf{P}(l < \tau_k,\, \deg_{F_{l+1}}(v) \ge m_v,\, v \in [k]). \tag{6} \]

Recall that trees in $F_i$ are listed in increasing order of their least elements; this implies
that indices of the trees of vertices $1, \ldots, k$ do not change until two trees indexed by $a, b \le
k$ are merged. Therefore, for all $v \in [k]$, $v \in T_v^{(i)}$ for $i \le \tau_k$. This implies the sets
$\{\mathcal{S}_v \cap [\tau_k - 1],\, v \in [k]\}$ are pairwise disjoint. These observations allow us to obtain a lower
bound analogous to Lemma 4.3.
Lemma 4.4. For any positive integers $k \ge 2$ and $m_1, \ldots, m_k, l < n$,

\[ \mathbf{P}(\deg(v) \ge m_v,\, v \in [k]) \ge 2^{-\sum_v m_v}\, \mathbf{P}(l < \tau_k,\, |\mathcal{S}_v \cap [l]| \ge m_v,\, v \in [k]). \]

Proof. By (6), it suffices to bound $\mathbf{P}(l < \tau_k,\, \deg_{F_{l+1}}(v) \ge m_v,\, v \in [k])$.

Let $\mathcal{A}^*$ be the set of sequences $A = (A_1, \ldots, A_k)$ of pairwise disjoint subsets of $[l]$
satisfying $|A_v| = m_v$ for all $v \in [k]$. For each $A \in \mathcal{A}^*$, let $D_A$ be the event that for all
$v \in [k]$, $\{i_{v,1}, \ldots, i_{v,m_v}\} = A_v$ (and so $|\mathcal{S}_v \cap [l]| \ge m_v$).

As in the proof of Lemma 4.3, we have that

\[ \{\deg_{F_{l+1}}(v) \ge m_v,\, v \in [k]\} \cap D_A = \{\xi_{i_{v,j}} \text{ favours } v \text{ for all } j \in [m_v],\, v \in [k]\} \cap D_A. \]

In this case, the sets $A_v$ are pairwise disjoint. If $\mathbf{P}(D_A) > 0$ then

\[ \mathbf{P}(\xi_{i_{v,j}} \text{ favours } v \text{ for all } j \in [m_v],\, v \in [k] \mid D_A) = 2^{-\sum_v m_v}. \]

Recall that $l < \tau_k$ if and only if the sets $\{\mathcal{S}_v \cap [l],\, v \in [k]\}$ are pairwise disjoint; that is,
if one of the events $D_A$ occurs. We then have

\[ \mathbf{P}(l < \tau_k,\, \deg_{F_{l+1}}(v) \ge m_v,\, v \in [k]) = \sum_{A \in \mathcal{A}^*} \mathbf{P}(D_A,\, \deg_{F_{l+1}}(v) \ge m_v,\, v \in [k]) \]
\[ = \sum_{A \in \mathcal{A}^*} 2^{-\sum_v m_v}\, \mathbf{P}(D_A) \]
\[ = 2^{-\sum_v m_v}\, \mathbf{P}(l < \tau_k,\, |\mathcal{S}_v \cap [l]| \ge m_v,\, v \in [k]). \] □

To use Lemma 4.4 we need tail bounds for $|\mathcal{S}_v \cap [l]|$ for some suitable $l < n$; these are
provided by the following proposition.

Proposition 4.5. Fix $\varepsilon \in (0,1)$ and $c \in (0, 2(1-\varepsilon))$. Then there exists $\beta = \beta(c, \varepsilon) > 0$
such that for any vertex $v$,

\[ \mathbf{P}(|\mathcal{S}_v \cap [n - \lceil n^{\varepsilon} \rceil]| < c \ln n) = o(n^{-\beta}). \]

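Proof. Fix $\varepsilon \in (0,1)$ and $c \in (0, 2(1-\varepsilon))$. Let $\{B_i,\, i \in \mathbb{N}\}$ be a collection of independent
Bernoulli r.v.’s, with $\mathbf{E}[B_i] = 2/i$. Recall the definition of $\mathcal{S}_v$ at the beginning of the section.

For any fixed vertex $v \in [n]$, and each $i \in [n-1]$, the probability of the event $\{v \in
T_{a_i}^{(i)} \cup T_{b_i}^{(i)}\}$ is $2/(n-i+1)$; this is because, in the forest $F_i$, there are $n-i+1$ trees and
the trees $T_{a_i}^{(i)}, T_{b_i}^{(i)}$ are chosen uniformly at random among them. Since these events
are independent we have $|\mathcal{S}_v| \stackrel{d}{=} \sum_{i=2}^{n} B_i$. Moreover, a time $j \in [n - \lceil n^{\varepsilon} \rceil]$ corresponds to
the index $i = n + 1 - j \in \{\lceil n^{\varepsilon} \rceil + 1, \ldots, n\}$, so writing $W_{n,\varepsilon} = \sum_{i=\lceil n^{\varepsilon} \rceil + 1}^{n} B_i$ we also
have

\[ W_{n,\varepsilon} \stackrel{d}{=} |\mathcal{S}_v \cap [n - \lceil n^{\varepsilon} \rceil]|. \]

We now apply Bernstein’s inequality (see, e.g., [9], Theorem 2.8) to obtain that for any
$t > 0$,

\[ \mathbf{P}(W_{n,\varepsilon} \le \mathbf{E}[W_{n,\varepsilon}] - t) \le \exp\Big\{ -\frac{t^2}{2\, \mathbf{E}[W_{n,\varepsilon}]} \Big\}. \]

We take $t = \mathbf{E}[W_{n,\varepsilon}] - c \ln n$. Since

\[ \mathbf{E}[W_{n,\varepsilon}] = \sum_{i=\lceil n^{\varepsilon} \rceil + 1}^{n} \frac{2}{i} = 2(1-\varepsilon) \ln n + O(1), \]

setting $\delta = 2(1-\varepsilon) - c > 0$ we have $t = \delta \ln n + O(1)$, so

\[ \mathbf{P}(|\mathcal{S}_v \cap [n - \lceil n^{\varepsilon} \rceil]| < c \ln n) = \mathbf{P}(W_{n,\varepsilon} < \mathbf{E}[W_{n,\varepsilon}] - t) = O(1) \cdot n^{-\delta^2/(4(1-\varepsilon))}. \]

Choosing $0 < \beta < \delta^2/(4(1-\varepsilon))$, the result follows. □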

The following lemma is the last ingredient for Proposition 4.2. 

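Lemma 4.6. Fix an integer $k \ge 2$ and let $\varepsilon \in (0,1)$. Then, for $n$ large enough,

\[ \mathbf{P}(\tau_k \le n - \lceil n^{\varepsilon} \rceil) \le \frac{2k^2}{\lceil n^{\varepsilon} \rceil - 1}. \]

Proof. By the definition of $\tau_k$, if $\tau_k > n - \lceil n^{\varepsilon} \rceil$ then $\{a_i, b_i\} \not\subset [k]$ for all $1 \le i \le n - \lceil n^{\varepsilon} \rceil$. The
events that $\{a_i, b_i\} \not\subset [k]$ are independent for distinct $i$ and $\mathbf{P}(\{a_i, b_i\} \subset [k]) = \frac{k(k-1)}{(n+1-i)(n-i)}$;
so we have that

\[ \mathbf{P}(\tau_k > n - \lceil n^{\varepsilon} \rceil) = \prod_{i=1}^{n - \lceil n^{\varepsilon} \rceil} \Big( 1 - \frac{k(k-1)}{(n+1-i)(n-i)} \Big)
   \ge 1 - \sum_{i=1}^{n - \lceil n^{\varepsilon} \rceil} \frac{2k^2}{(n-i)^2}. \]

The last inequality holds for $n$ large enough. Since $\sum_{j \ge m} j^{-2} \le \int_{m-1}^{\infty} x^{-2}\, dx = (m-1)^{-1}$,
we get

\[ \mathbf{P}(\tau_k \le n - \lceil n^{\varepsilon} \rceil) \le \sum_{i=1}^{n - \lceil n^{\varepsilon} \rceil} \frac{2k^2}{(n-i)^2}
   \le \sum_{j=\lceil n^{\varepsilon} \rceil}^{\infty} \frac{2k^2}{j^2} \le \frac{2k^2}{\lceil n^{\varepsilon} \rceil - 1}. \] □

We finish this section with the proof of Proposition 4.2.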


Proof of Proposition 4.2. Fix $c \in (0,2)$, $k \in \mathbb{N}$ and let $m_1, \ldots, m_k < c \ln n$ be positive
integers. Let $\varepsilon = (2-c)/4$ so that Proposition 4.5 holds for some $\beta(c) = \beta(c, \varepsilon) > 0$. For
$k = 1$, the result follows from the equality in Lemma 4.3 and Proposition 4.5 since

\[ \mathbf{P}(|\mathcal{S}_1| < m_1) \le \mathbf{P}(|\mathcal{S}_1 \cap [n - \lceil n^{\varepsilon} \rceil]| < c \ln n) = o(n^{-\beta}). \]

For $k \ge 2$, the upper bound is likewise established immediately by Lemma 4.3. For the
lower bound, letting $l = n - \lceil n^{\varepsilon} \rceil$, by Lemma 4.6 and Proposition 4.5 we have

\[ \mathbf{P}(l < \tau_k,\, |\mathcal{S}_v \cap [l]| \ge m_v,\, v \in [k]) \ge 1 - \mathbf{P}(l \ge \tau_k) - \sum_{v \in [k]} \mathbf{P}(|\mathcal{S}_v \cap [l]| < m_v) \ge 1 - o(n^{-\alpha}), \]

where $\alpha < \min\{\beta, \varepsilon\}$. By Lemma 4.4, it follows that

\[ \mathbf{P}(\deg(v) \ge m_v,\, v \in [k]) = 2^{-\sum_v m_v} (1 + o(n^{-\alpha})), \]

as required. □
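
One of the two error terms in the proof above, the probability that $\tau_k \le n - \lceil n^{\varepsilon} \rceil$, can be evaluated exactly from the product formula in the proof of Lemma 4.6 and compared with the stated bound $2k^2/(\lceil n^{\varepsilon} \rceil - 1)$. The sketch below does this numerically; the function names are hypothetical and the check is an illustration for specific parameter values, not a proof.

```python
import math


def prob_tau_small(n, k, eps):
    """Exact P(tau_k <= n - ceil(n^eps)) via the product over merge steps."""
    m = math.ceil(n ** eps)
    prod = 1.0
    for i in range(1, n - m + 1):
        # probability that the pair chosen at step i lies inside [k]
        prod *= 1.0 - k * (k - 1) / ((n + 1 - i) * (n - i))
    return 1.0 - prod


def lemma_bound(n, k, eps):
    """Upper bound 2 k^2 / (ceil(n^eps) - 1) from Lemma 4.6."""
    return 2.0 * k * k / (math.ceil(n ** eps) - 1)
```

For example, with $n = 10^4$, $k = 3$ and $\varepsilon = 1/2$ the exact probability is below $0.06$, while the bound is about $0.18$.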


5. Proof of Proposition 2.1 

By Corollary 3.4 we can study vertex degrees in $T^{(n)}$ and derive conclusions about the
variables $X_i^{(n)}, X_{\ge i}^{(n)}$, $i \in \mathbb{Z}$. Recall that we write $\deg(v) = \deg_{T^{(n)}}(v)$, for $v \in [n]$.

Lemma 5.1. For any $k \in \mathbb{N}$ and integers $m_1, \ldots, m_k$,

\[ \mathbf{P}(\deg(u) = m_u,\, u \in [k]) = \sum_{j=0}^{k} \sum_{S \subset [k],\, |S| = j} (-1)^j\, \mathbf{P}(\deg(u) \ge m_u + \mathbf{1}_{[u \in S]},\, u \in [k]). \]

Furthermore, for $k' \in \mathbb{N}$ and integers $m_{k+1}, \ldots, m_{k+k'}$,

\[ \mathbf{P}(\deg(u) = m_u,\, \deg(v) \ge m_v,\, 1 \le u \le k < v \le k + k') \]
\[ = \sum_{j=0}^{k} \sum_{S \subset [k],\, |S| = j} (-1)^j\, \mathbf{P}(\deg(w) \ge m_w + \mathbf{1}_{[w \in S]},\, w \in [k + k']). \]

Proof. The second equation follows by intersecting the event $\{\deg(v) \ge m_v,\, k < v \le k + k'\}$
along all probabilities in the first equation. The first is straightforwardly proved using the
inclusion-exclusion principle. □
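
The first identity in Lemma 5.1 holds for any integer-valued random vector, not only for degrees. The sketch below verifies it on a small, arbitrary joint distribution; the function names and the toy distribution used in the usage note are assumptions introduced for illustration.

```python
from itertools import combinations


def tail(pmf, thresholds):
    """P(X_u >= thresholds[u] for all u) under a joint pmf {tuple: prob}."""
    return sum(p for x, p in pmf.items()
               if all(xu >= t for xu, t in zip(x, thresholds)))


def point_prob_via_tails(pmf, m):
    """Recover P(X = m) from tail probabilities by inclusion-exclusion."""
    k = len(m)
    total = 0.0
    for j in range(k + 1):
        for S in combinations(range(k), j):
            # raise the threshold by one exactly on the coordinates in S
            shifted = [m[u] + (1 if u in S else 0) for u in range(k)]
            total += (-1) ** j * tail(pmf, shifted)
    return total
```

For a two-dimensional example with weights proportional to $1 + x + 2y$ on $\{0,1,2\}^2$, the alternating sum of four tail probabilities returns the point mass at $(1,2)$, namely $6/36$.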

We are now ready to prove Proposition 2.1. 

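Proof of Proposition 2.1. Let $c \in (0,2)$ and $K \in \mathbb{N}$. Let $i < i'$ be integers such that
$0 < i + \log n < i' + \log n < c \ln n$ and let $a_j$, $i \le j \le i'$, be non-negative integers with
$a_i + \cdots + a_{i'} = K$. We are interested in the factorial moments $\mathbf{E}\big[ (X_{\ge i'}^{(n)})_{a_{i'}} \prod_{i \le k < i'} (X_k^{(n)})_{a_k} \big]$.

For $i \le k \le i'$, for each $v$ with $\sum_{j=i}^{k-1} a_j < v \le \sum_{j=i}^{k} a_j$ let $m_v = \lfloor \log n \rfloor + k$. Let
$K' = K - a_{i'}$; by Corollary 3.4 and the exchangeability of the vertex degrees of $T^{(n)}$,

\[ \mathbf{E}\Big[ (X_{\ge i'}^{(n)})_{a_{i'}} \prod_{i \le k < i'} (X_k^{(n)})_{a_k} \Big]
   = (n)_K\, \mathbf{P}(\deg(u) = m_u,\, \deg(v) \ge m_v,\, 1 \le u \le K' < v \le K) \]
\[ = (n)_K \sum_{l=0}^{K'} \sum_{S \subset [K'],\, |S| = l} (-1)^l\, \mathbf{P}(\deg(v) \ge m_v + \mathbf{1}_{[v \in S]},\, v \in [K]); \]

the last equality by Lemma 5.1. At this point we can apply Proposition 4.2 to each of the
terms. Since $m_v < c \ln n$ for $v \in [K]$, there is $\alpha' = \alpha'(c,K) > 0$ such that

\[ \sum_{l=0}^{K'} \sum_{S \subset [K'],\, |S| = l} (-1)^l\, \mathbf{P}(\deg(v) \ge m_v + \mathbf{1}_{[v \in S]},\, v \in [K]) \]
\[ = \sum_{l=0}^{K'} \sum_{S \subset [K'],\, |S| = l} (-1)^l\, 2^{-l - \sum_v m_v} (1 + o(n^{-\alpha'})) \]
\[ = 2^{-\sum_v m_v} \sum_{l=0}^{K'} \binom{K'}{l} \Big( -\frac{1}{2} \Big)^l (1 + o(n^{-\alpha'})) \]
\[ = 2^{-K' - \sum_v m_v} (1 + o(n^{-\alpha'})). \]

Using that $(n)_K = n^K (1 + O(n^{-1}))$, we get

\[ \mathbf{E}\Big[ (X_{\ge i'}^{(n)})_{a_{i'}} \prod_{i \le k < i'} (X_k^{(n)})_{a_k} \Big]
   = 2^{K \log n - K' - \sum_{v=1}^{K} m_v} (1 + o(n^{-\alpha})); \]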
where $\alpha = \min\{\alpha', 1\}$. Finally, to complete the proof, note that

\[ K \log n - K' - \sum_{v=1}^{K} m_v = \sum_{v=K'+1}^{K} (\log n - m_v) + \sum_{v=1}^{K'} (\log n - 1 - m_v) \]
\[ = (-i' + \varepsilon_n)\, a_{i'} + \sum_{k=i}^{i'-1} (-k - 1 + \varepsilon_n)\, a_k. \] □


6. Proofs of the main theorems


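Proof of Theorem 1.2. By Theorem 11.1.VII of [4], weak convergence in $\mathcal{M}^{\#}_{\mathbb{Z}^*}$ is equivalent
to convergence of FDD’s, that is, convergence of every finite family of bounded continuity
sets; see Definition 11.1.IV of [ ]. For any point process $\xi$ on $\mathbb{Z}$ and any $i \in \mathbb{Z}$, we have that
$\mathbb{Z} \cap [i, \infty)$ is a bounded stochastic continuity set for the underlying measure of $\xi$ in $\mathcal{M}^{\#}_{\mathbb{Z}^*}$.
Thus, any FDD of $\xi$ can be recovered from suitable marginals of the joint distribution of
$(\xi(i), \ldots, \xi(i'-1), \xi[i', \infty))$ for some $i < i' \in \mathbb{Z}$.

Let $\varepsilon \in [0,1]$ and $(n_l)_{l \ge 1}$ be an increasing sequence with $\varepsilon_{n_l} \to \varepsilon$. The goal then is to
prove that, for any integers $i < i'$, the joint distribution of

\[ (X_i^{(n_l)}, \ldots, X_{i'-1}^{(n_l)}, X_{\ge i'}^{(n_l)}) \]

converges to the joint distribution of

\[ (\mathcal{P}^{\varepsilon}(i), \ldots, \mathcal{P}^{\varepsilon}(i'-1), \mathcal{P}^{\varepsilon}[i', \infty)); \]

that is, to the law of independent Poisson r.v.’s with parameters $2^{-i-1+\varepsilon}, \ldots, 2^{-(i'-1)-1+\varepsilon}, 2^{-i'+\varepsilon}$.

We compute the limit of the factorial moments of $X_i^{(n_l)}, \ldots, X_{i'-1}^{(n_l)}, X_{\ge i'}^{(n_l)}$. For any
non-negative integers $a_i, \ldots, a_{i'}$, by Proposition 2.1,

\[ \mathbf{E}\Big[ (X_{\ge i'}^{(n_l)})_{a_{i'}} \prod_{i \le k < i'} (X_k^{(n_l)})_{a_k} \Big]
   = \big(2^{-i'+\varepsilon_{n_l}}\big)^{a_{i'}} \prod_{i \le k < i'} \big(2^{-(k+1)+\varepsilon_{n_l}}\big)^{a_k} (1 + o(n_l^{-\alpha})) \]
\[ \to \big(2^{-i'+\varepsilon}\big)^{a_{i'}} \prod_{i \le k < i'} \big(2^{-(k+1)+\varepsilon}\big)^{a_k}, \]

as $n_l \to \infty$. The limit corresponds to the factorial moment

\[ \mathbf{E}\Big[ (\mathcal{P}^{\varepsilon}[i', \infty))_{a_{i'}} \prod_{i \le k < i'} (\mathcal{P}^{\varepsilon}(k))_{a_k} \Big]. \]

The result follows (by, e.g., Theorem 6.10 of [9]). □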

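Proof of Theorem 1.3. Since $\{\Delta_n \ge \lfloor \log n \rfloor + i\} = \{X_{\ge i}^{(n)} > 0\}$, we need only to estimate
$\mathbf{P}(X_{\ge i}^{(n)} > 0)$. If $i = O(1)$, then $2^{-i+\varepsilon_n}$ is bounded away from $0$ and $\infty$ (recall that
$\liminf_{n \to \infty} i(n) > -\infty$), so $1 - \exp\{-2^{-i+\varepsilon_n}\}$ is bounded away from $0$ and it suffices to prove that

\[ \mathbf{P}(X_{\ge i}^{(n)} = 0) - \exp\{-2^{-i+\varepsilon_n}\} \to 0, \]

as $n \to \infty$. This follows from Theorem 1.2 and the subsubsequence principle. Suppose that
there exist $\delta > 0$ and a subsequence $n_k$ for which $|\mathbf{P}(X_{\ge i}^{(n_k)} = 0) - \exp\{-2^{-i+\varepsilon_{n_k}}\}| > \delta$.
Since $\{\varepsilon_{n_k}\}_{k \ge 1}$ is a bounded set there is a subsubsequence $n_{k_l}$ such that $\varepsilon_{n_{k_l}} \to \varepsilon$ for
some $\varepsilon \in [0,1]$. By Theorem 1.2, $\mathbf{P}(X_{\ge i}^{(n_{k_l})} = 0) \to \exp\{-2^{-i+\varepsilon}\}$; this contradicts our
assumption on the subsequence $n_k$.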
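Now consider the case $i \to \infty$ with $i + \log n < 2 \ln n$. By a standard inclusion-exclusion
argument (see, e.g., [3] Corollary 1.11),

\[ \mathbf{P}(X_{\ge i}^{(n)} = 0) = \sum_{r=0}^{n} (-1)^r\, \frac{\mathbf{E}\big[ (X_{\ge i}^{(n)})_r \big]}{r!}, \tag{7} \]

and this sum has the so-called alternating inequalities property; this means that partial
sums alternately serve as upper and lower bounds for $\mathbf{P}(X_{\ge i}^{(n)} = 0)$. Consequently,

\[ \mathbf{E}\big[ X_{\ge i}^{(n)} \big] - \frac{1}{2}\, \mathbf{E}\big[ (X_{\ge i}^{(n)})_2 \big] \le \mathbf{P}(X_{\ge i}^{(n)} > 0) \le \mathbf{E}\big[ X_{\ge i}^{(n)} \big]. \tag{8} \]

Using Proposition 2.1 and the fact that $i \to \infty$, we have that

\[ \mathbf{E}\big[ X_{\ge i}^{(n)} \big] = 2^{-i+\varepsilon_n} (1 + o(1)) \]

and

\[ \mathbf{E}\big[ X_{\ge i}^{(n)} \big] - \frac{1}{2}\, \mathbf{E}\big[ (X_{\ge i}^{(n)})_2 \big] = 2^{-i+\varepsilon_n} (1 + o(1)) = (1 - \exp\{-2^{-i+\varepsilon_n}\})(1 + o(1)). \]

The result follows. □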


Proof of Theorem 1.4. We again use the method of moments. By Theorem 1.24 of [3], it
suffices to prove that, as $n \to \infty$,

\[ \mathbf{E}\big[ (X_i^{(n)})_a \big] - \big(2^{-i-1+\varepsilon_n}\big)^a = o\Big( \big(2^{-(i+1-\varepsilon_n)}\big)^{-b} \Big), \tag{9} \]

for all fixed $1 \le a \le b$. Since $i = o(\ln n)$, we have that $2^{-i-1+\varepsilon_n} = n^{o(1)}$. On the other
hand, by Proposition 2.1 there is $\alpha > 0$ such that

\[ \mathbf{E}\big[ (X_i^{(n)})_a \big] - \big(2^{-i-1+\varepsilon_n}\big)^a = o\big( n^{-\alpha}\, 2^{-(i+1-\varepsilon_n) a} \big) = n^{-\alpha + o(1)} = o\Big( \big(2^{-(i+1-\varepsilon_n)}\big)^{-b} \Big). \]

Therefore, condition (9) is satisfied and the proof is complete. □